-
-
Notifications
You must be signed in to change notification settings - Fork 131
A few fixes for search. #64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* Focus the search box when you click "search" in the header * There were some bad codegen things -- trying to access ctx.db inside an action.
|
@jamwt is attempting to deploy a commit to the Amantus Machina Team on Vercel. A member of the Team first needs to authorize it. |
…stat sets. Will prevent more invalidations of the /skills page.
|
Putting back to draft. I don't want to merge this just yet. |
| } | ||
|
|
||
| // Update cursor for next batch fetch | ||
| cursor = events[events.length - 1]._creationTime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Events with the same _creationTime can be lost due to cursor update logic. The code updates the working cursor to the last event's creation time, but when events share the same creation time timestamp, the next run will skip them because the query uses .gt() (strictly greater than) instead of inclusive comparison.
View Details
📝 Patch Details
diff --git a/convex/skillStatEvents.ts b/convex/skillStatEvents.ts
index 03d6722..03f2ddc 100644
--- a/convex/skillStatEvents.ts
+++ b/convex/skillStatEvents.ts
@@ -290,6 +290,15 @@ const MAX_SKILLS_PER_RUN = 500
/**
* Fetch a batch of events after the given cursor (by _creationTime).
* Returns events sorted by _creationTime ascending.
+ *
+ * Uses .gte() (greater than or equal) instead of .gt() to ensure we don't skip
+ * events that share the same _creationTime as the cursor. Since events can be
+ * created within the same millisecond, using .gt() would lose events at the
+ * boundary when batches are split by skill limits.
+ *
+ * To avoid reprocessing: callers must use cursorCreationTime from a previous
+ * complete run (exhausted = true), or must be prepared to re-aggregate events
+ * with the same skillId (which is safe since aggregation is idempotent).
*/
export const getUnprocessedEventBatch = internalQuery({
args: {
@@ -300,11 +309,13 @@ export const getUnprocessedEventBatch = internalQuery({
const limit = args.limit ?? EVENT_BATCH_SIZE
const cursor = args.cursorCreationTime
- // Query events after the cursor using the built-in creation time index
+ // Query events at or after the cursor using the built-in creation time index.
+ // We use .gte() (>=) instead of .gt() (>) to handle the case where multiple
+ // events share the same _creationTime but are split across batch boundaries.
const events = await ctx.db
.query('skillStatEvents')
.withIndex('by_creation_time', (q) =>
- cursor !== undefined ? q.gt('_creationTime', cursor) : q,
+ cursor !== undefined ? q.gte('_creationTime', cursor) : q,
)
.take(limit)
return events
Analysis
Cursor-based pagination loses events with duplicate _creationTime timestamps
What fails: The processSkillStatEventsAction function in convex/skillStatEvents.ts can permanently lose unprocessed skill stat events when multiple events share the same _creationTime value and batch processing is interrupted by the skill limit.
How to reproduce:
- Insert stat events where multiple events have the same
_creationTimetimestamp (possible in Convex since events can be created within the same millisecond) - Ensure enough events exist at that timestamp to span across batch boundaries
- Configure
MAX_SKILLS_PER_RUNsmall enough that processing stops mid-batch (e.g., 2-3 unique skills) - Run the action - it processes the first batch and hits the skill limit
- The cursor is saved as the maximum
_creationTimeseen - Run the action again - the query uses
.gt()(strictly greater than) to fetch the next batch - Events with
_creationTimeequal to the saved cursor are permanently skipped
Result: Events that were never processed are lost, resulting in permanently missing stat updates for those events.
Expected: All events should be processed exactly once, even when they share the same _creationTime timestamp.
Root cause: The getUnprocessedEventBatch query at line 307 uses .gt('_creationTime', cursor) which performs a strictly greater-than comparison. When events are batched and the batch boundary falls in the middle of a group of events with the same _creationTime, subsequent queries will skip those boundary events.
Fix applied: Changed the query operator from .gt() to .gte() (greater than or equal) in the getUnprocessedEventBatch query function. This ensures events at the cursor timestamp are included in subsequent batches. The aggregation logic is idempotent (processing the same skill's events twice produces the same result), so minor reprocessing of boundary events is safe.
See: Convex index documentation on comparison operators - .gte() performs greater-than-or-equal comparison as opposed to .gt()'s strictly greater-than behavior.
Uh oh!
There was an error while loading. Please reload this page.